---
title: "Luxembourg Research Evaluation 2022"
author: "Daniel S. Hain"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  pdf_document:
    toc: no
  html_notebook:
    df_print: paged
    toc: no
    code_folding: hide
---

```{r setup, include=FALSE}
### Generic preamble
rm(list=ls())
Sys.setenv(LANG = "en")
options(scipen = 5)
set.seed(1337)

### Load packages  
library(knitr) # For display of the markdown
library(kableExtra) # For table styling

library(tidyverse)
library(magrittr)

### Extra packages
library(bibliometrix)
library(tidygraph)

# own functions
source("functions/functions_basic.R")
# source("functions/00_parameters.R")
```

```{r global_options, include=FALSE}
knitr::opts_chunk$set(echo = FALSE,
                      warning = FALSE, 
                      message = FALSE)
```

```{r, include=FALSE}
var_inst <- 'LISER'
var_dept <- 'UD'
```

# Preprocessing articles

```{r}
files <- list.files(path = '../data/', pattern = paste0('scopus_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '_seed'), full.names = TRUE)
```

```{r}
# Load bibliographic data
M <- convert2df(file = files[1], dbsource = "scopus", format = "csv") %>% mutate(int_dept = TRUE) %>% 
  bind_rows(convert2df(file = files[-1], dbsource = "scopus", format = "csv") %>% mutate(int_dept = FALSE)) %>%
  # Delete duplicates (Better use EID)
  distinct(UT, .keep_all = TRUE)

rm(files)
```

```{r}
# Extract Meta Tags #TODO: Maybe more?
M %<>% metaTagExtraction(Field = "AU_CO", aff.disamb = TRUE, sep = ";")
M %<>% metaTagExtraction(Field = "AU1_CO", aff.disamb = TRUE, sep = ";")
M %<>% metaTagExtraction(Field = "AU1_UN", aff.disamb = TRUE, sep = ";")
M %<>% metaTagExtraction(Field = "SR", aff.disamb = TRUE, sep = ";")
M %<>% metaTagExtraction(Field = "CR_AU", aff.disamb = TRUE, sep = ";")
M %<>% metaTagExtraction(Field = "CR_SO", aff.disamb = TRUE, sep = ";")
```


```{r}
# create label
M %<>% rownames_to_column('XX') %>% 
  mutate(XX = paste(str_extract(XX, pattern = ".*\\d{4}"), str_sub(TI, 1,25)) %>% str_replace_all("[^[:alnum:]]", " ") %>% str_squish() %>% str_replace_all(" ", "_") %>% make.unique(sep='_')) %>%
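  # Keep documents with enough cited references (CR_n counts the ';' separators in CR)
  mutate(CR_n = CR %>% str_count(';')) %>%
  filter(CR_n >= 5) %>%
  # Keep documents with a non-trivial abstract
  filter(AB != '') %>%
  filter(AB %>% str_length() >= 25)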

# Setting rownames
rownames(M) <- M$XX
```

```{r}
# Citations per year: keep documents with at least one citation per year, plus all documents of the focal department
M %<>% 
  mutate(TC_year = TC / (2023 - PY)) %>% 
  filter(TC_year >= 1 | int_dept == TRUE) 
# %>% filter(percent_rank(TC_year) >= 0.5)

```

```{r}
# Save whole compilation
M %>% saveRDS(paste0('../temp/M_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
# M <- read_rds(paste0('M_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```


# Bibliographic Coupling Network

```{r}
mat_bib <- M  %>% biblioNetwork(analysis = "coupling", network = "references", sep = ";", shortlabel =  FALSE)
# mat_bib %>% saveRDS(paste0('../temp/mat_bib__', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
# mat_bib <- readRDS(paste0('../temp/mat_bib__', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

```{r}
g_bib <- mat_bib %>% igraph::graph_from_adjacency_matrix(mode = "undirected", weighted = TRUE, diag = FALSE) %>% 
  igraph::simplify() %>%
  as_tbl_graph(directed = FALSE) %N>% 
  left_join(M %>% select(XX, PY, CR_n, TC_year, int_dept), by = c("name" = "XX"))
```

## Restrict the network

```{r}
# Initial Filter
cutof_edge_bib <- 2
cutof_node_bib <- 5

cutof_edge_pct_bib <- 0.05
cutof_node_pct_bib <- 0.25
```

```{r}
g_bib <- g_bib %E>% 
  filter(weight >= cutof_edge_bib)

g_bib <- g_bib %N>%
  filter(!node_is_isolated()  | int_dept == TRUE) %N>%
  mutate(dgr = centrality_degree(weights = weight)) %N>% 
  filter(dgr >= cutof_node_bib | int_dept == TRUE)
```

```{r}
# Jaccard weighting
g_bib <- g_bib %E>% 
  mutate(weight_jac = weight / (.N()$CR_n[from] + .N()$CR_n[to] - weight) ) %E>%
  mutate(weight_jac = if_else(weight_jac > 1, 1, weight_jac) ) %N>%
  mutate(dgr_jac = centrality_degree(weights = weight_jac)) 
```
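
The Jaccard weighting in the chunk above can be written out as follows. This is a sketch under stated assumptions: $w_{ij}$ stands for the edge attribute `weight` (the number of references shared by documents $i$ and $j$), and $n_i$, $n_j$ stand for the node attribute `CR_n`. The symbols are introduced here for illustration only and do not appear in the code.

$$
J_{ij} = \frac{w_{ij}}{n_i + n_j - w_{ij}}, \qquad J_{ij} \leftarrow \min(J_{ij},\, 1)
$$

For example, with $w_{ij} = 4$, $n_i = 10$ and $n_j = 14$, the weight is $4 / (10 + 14 - 4) = 0.2$. The cap at 1 mirrors the `if_else()` step and guards against ratios above 1, which could presumably arise because `CR_n` counts `;` separators rather than distinct references and so may undercount.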

```{r}
# Further restrictions
g_bib <- g_bib  %N>%
  filter(percent_rank(dgr_jac) >= cutof_node_pct_bib | int_dept == TRUE) %E>% 
  filter(percent_rank(weight_jac) >= cutof_edge_pct_bib) %N>%
  filter(!node_is_isolated() | int_dept == TRUE)
```

## Community Detection

```{r}
g_bib <- g_bib %N>%
  mutate(com = group_louvain(weights = weight_jac)) %>%
  morph(to_split, com) %>% 
  mutate(dgr_int = centrality_degree(weights = weight_jac)) %N>%
  unmorph()
```

```{r}
g_bib %N>% as_tibble() %>% count(com, sort = TRUE)
```

```{r}
# Minimum community size
com_size_bib <- 100
```

```{r}
# Community size restriction
g_bib <- g_bib %N>%
  add_count(com, name = 'com_n') %>%
  mutate(com = ifelse(com_n >= com_size_bib, com, NA) ) %>%
  select(-com_n)  

# Delete nodes without community
g_bib <- g_bib %N>%
  filter(!is.na(com) | int_dept == TRUE)
```

```{r}
# Update degree
g_bib <- g_bib %N>%
  mutate(dgr = centrality_degree(weights = weight),
         dgr_jac = centrality_degree(weights = weight_jac))
```

```{r}
# Save the objects we need later on
g_bib %>% saveRDS(paste0('../temp/g_bib_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

## Merge with main data

```{r}
M_bib <- M %>% select(XX) %>% inner_join(g_bib %N>% as_tibble() %>% select(name, dgr, dgr_jac, com, dgr_int), by = c('XX' = 'name')) %>%
  distinct(XX, .keep_all = TRUE) 

M_bib %>% saveRDS(paste0('../temp/M_bib_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

## Aggregated Network

```{r}
require(RNewsflow)
g_bib_agg <- g_bib %N>%
  filter(!is.na(com)) %>%
  network_aggregate(by = "com", edge_attribute = "weight_jac", agg_FUN = sum)  %>%
  as.undirected(mode = "collapse", edge.attr.comb = "sum") %>%
  as_tbl_graph(directed = FALSE) %N>%
  select(-name) %>%
  mutate(id = 1:n()) %E>%
  rename(weight = agg.weight_jac) %>%
  select(from, to, weight)
```

```{r}
## Weight edges
# g_bib_agg <- g_bib_agg %E>%
#   rename(weight_count = weight) %>%
#   mutate(weight = weight_count / (.N()$N[from] * .N()$N[to]) ) %>%
#   mutate(weight = (weight * 100) %>% round(4)) %N>%
#   mutate(dgr = centrality_degree(weights = weight))
```

```{r}
# Save the objects we need later on
g_bib_agg %>% saveRDS(paste0('../temp/g_bib_agg_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

```{r}
# Delete everything we no longer need
rm(mat_bib, g_bib, com_size_bib, cutof_edge_bib, cutof_node_bib, g_bib_agg)
```


# Co-citation Network

```{r}
mat_cit <- M %>%
  semi_join(M_bib, by = 'XX') %>%
  as.data.frame() %>% 
  biblioNetwork(analysis = "co-citation", network = "references", sep = ";", shortlabel = FALSE)
```

```{r}
mat_cit %>% saveRDS(paste0('../temp/mat_cit_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
# mat_cit <- readRDS(paste0('../temp/mat_cit_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

```{r}
g_cit <- mat_cit %>% igraph::graph_from_adjacency_matrix(mode = "undirected", weighted = TRUE, diag = FALSE) %>% 
  igraph::simplify() %>%
  as_tbl_graph(directed = FALSE) # %N>% left_join(M %>% select(XX, SR, PY, TC, J9), by = c("name" = "XX")) %>% mutate(id = 1:n()) 
```

```{r}
# Initial Filter
cutof_edge_cit <- 2
cutof_node_cit <- 5

cutof_edge_pct_cit <- 0.25
cutof_node_pct_cit <- 0.25
```

```{r}
# Restrict the network
g_cit <- g_cit %E>% 
  filter(weight >= cutof_edge_cit) %N>%
  filter(!node_is_isolated())

g_cit <- g_cit %N>%
  mutate(dgr = centrality_degree(weights = weight)) %N>%
  filter(dgr >= cutof_node_cit) 
```

```{r}
# Further restrictions
g_cit <- g_cit %N>% 
  filter(percent_rank(dgr) >= cutof_node_pct_cit) %E>%
  filter(percent_rank(weight) >= cutof_edge_pct_cit) %N>%
  filter(!node_is_isolated())
```

## Community Detection

```{r}
g_cit <- g_cit %N>%
  mutate(com = group_louvain(weights = weight)) %N>%
  morph(to_split, com) %>% 
  mutate(dgr_int = centrality_degree(weights = weight)) %>%
  unmorph()
```

```{r}
g_cit %N>% as_tibble() %>% count(com)
```

```{r}
# Minimum community size
com_size_cit <- 250
```


```{r}
# Community size restriction
g_cit <- g_cit %N>%
  add_count(com, name = 'com_n') %>%
  mutate(com = ifelse(com_n >= com_size_cit, com, NA) ) %>%
  select(-com_n)  

# Delete nodes without community
g_cit <- g_cit %N>%
  filter(!is.na(com))

# Update degree
g_cit <- g_cit %N>%
  mutate(dgr = centrality_degree(weights = weight))
```

```{r}
# Save the objects we need later on
g_cit %>% saveRDS(paste0('../temp/g_cit_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

```{r}
# generate citation report
C_nw <- g_cit %N>% as_tibble() 
C_nw %>%  saveRDS(paste0('../temp/C_nw_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

## Aggregated Network

```{r}
require(RNewsflow)
g_cit_agg <- g_cit %>%
  network_aggregate(by = "com", edge_attribute = "weight", agg_FUN = sum)  %>%
  as.undirected(mode = "collapse", edge.attr.comb = "sum") %>%
  as_tbl_graph(directed = FALSE) %N>%
  select(-name) %>%
  mutate(id = 1:n()) %E>%
  rename(weight = agg.weight) %>%
  select(from, to, weight)
```

```{r}
g_cit_agg %>% saveRDS(paste0('../temp/g_cit_agg_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

```{r}
rm(mat_cit, g_cit, g_cit_agg)
```


# Two-mode network


```{r}
rownames(M) <- M %>% pull(XX)

m_2m <- M %>% 
  semi_join(M_bib, by = 'XX') %>%
  as.data.frame() %>% cocMatrix(Field = "CR", sep = ";", short = FALSE)
```

```{r}
g_2m <- m_2m %>% igraph::graph_from_incidence_matrix(directed = TRUE, mode = 'out', multiple = FALSE) %>% 
  igraph::simplify() 
```

```{r}
el_2m <- g_2m %>%
  get.edgelist() %>%
  as_tibble() %>%
  rename(from = V1,
         to = V2)
```

```{r}
el_2m %<>%
  left_join(M_bib %>% select(XX, com), by = c('from' = 'XX')) %>%
  rename(com_bib = com) %>%
  left_join(M %>% select(XX, PY), by = c('from' = 'XX')) %>%
  left_join(C_nw %>% select(name, com), by = c('to' = 'name')) %>%
  rename(com_cit = com) %>% 
  drop_na(PY, com_bib, com_cit)
```

```{r}
# save
el_2m %>% saveRDS(paste0('../temp/el_2m_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

```{r}
rm(m_2m, g_2m, el_2m, C_nw)
```


# Topic model

```{r}
library(tidytext)
library(topicmodels)
library(textstem)
```

```{r}
# Extract text to work with
text_tidy <- M %>% 
  as_tibble() %>%
  select(XX, AB) %>%
  rename(document = XX,
         text = AB) 
```

```{r}
# Some initial cleaning
text_tidy %<>% 
  mutate(text = text %>% 
           str_to_lower() %>%
           # Drop trailing trademark/copyright notices first, while the entities still contain '&'
           str_remove_all("(&trade;|&reg;|&copy;|&#8482;|&#174;|&#169;).*") %>%
           str_replace_all("&", "-and-") %>%
           iconv(to = "UTF-8", sub = "byte") %>%
           str_remove_all("\u00a9.*") %>% # assumed: the garbled character was the copyright sign
           str_remove_all('[:digit:]') %>%
           str_squish() 
  )  %>%
  drop_na() 
```

```{r}
# n grams
text_tidy %<>% 
  unnest_ngrams(term, text, ngram_delim = ' ', n_min = 1, n = 3) %>% 
  separate(term, c("word1", "word2", "word3"), sep = " ", fill = "right")
```

```{r}
# Stopwords
stop_words_own <- tibble(
  word =c("the", "rights","reserved" , "study", "studies", "these", "this", "paper", "result", "model", "approach", "article", "author", "method", "understand", "focus", "examine", "aim", "argue", "identify", "increase", "datum", "potential", "explore", "include", "issue", "propose", "address", "apply", "require", "analyse", "relate", "finding", "analyze", "discuss", "contribute", "publish", "involve", "draw", "lead", "exist", "set", "reduce", "create", "form", "explain", "play",  "affect", "regard", "associate", "establish", "follow", "conclude", "define", "strong", "attempt", "finally", "elsevier", "offer",  "taylor", "francis", "copyright", "springer", "wiley", "emerald", "copyright", "b.v"),
  lexicon = 'own') %>% 
  bind_rows(stop_words)

# Lemmatizing 
lemma_own <- tibble( # TODO: extend this custom lemma list
  token = c("systems", "institutional", "technological", "national", "regional", "sustainable",    "environmental", "political", "politic", "politics"),
  lemma = c("system", "institution",   "technology",    "nation",   "region",   "sustainability", "environment", "policy", "policy", "policy"))
```

```{r}
text_tidy %<>%
  filter(!word1 %in% stop_words_own$word,
         !word2 %in% stop_words_own$word,
         !word3 %in% stop_words_own$word,
         is.na(word1) | str_length(word1) > 2,
         is.na(word2) | str_length(word2) > 2,
         is.na(word3) | str_length(word3) > 2)
```

```{r}
lemma_new <- lexicon::hash_lemmas %>% 
  filter(token != 'data') %>%
  anti_join(lemma_own, by = 'token') %>%
  bind_rows(lemma_own)
```

```{r}
text_tidy %<>%
  mutate(word1 = word1 %>% lemmatize_words(dictionary = lemma_new),
         word2 = word2 %>% lemmatize_words(dictionary = lemma_new),
         word3 = word3 %>% lemmatize_words(dictionary = lemma_new))
```

```{r}
rm(stop_words_own, lemma_own, lemma_new)
```


```{r}
# Unite all again
text_tidy %<>%
  unite(term, word1, word2, word3, na.rm = TRUE, sep = " ")
```

```{r}
# TFIDF weighting
text_tidy %<>%
  count(document, term) %>%
  bind_tf_idf(term, document, n)
```

```{r}
# Document-term matrix (DTM)
text_dtm <- text_tidy %>%
  cast_dtm(document, term, n) %>% tm::removeSparseTerms(sparse = .99)
```

```{r}
# Finding the number of topics
library("ldatuning")

find_topics <- text_dtm %>%
  FindTopicsNumber(
    topics = seq(from = 4, to = 15, by = 1),
    metrics = c("Griffiths2004", "CaoJuan2009", "Arun2010", "Deveaud2014"),
    method = "Gibbs",
    control = list(seed = 1337),
    mc.cores = 4L,
    verbose = TRUE
)

find_topics %>% FindTopicsNumber_plot() 
# LISER UD: Taking 8 or 11 topics
```

```{r}
# LDA
n_topic <- 11

text_lda <- text_dtm %>% LDA(k = n_topic, method= "Gibbs", control = list(seed = 1337))
```

```{r}
### LDA Viz
library(LDAvis)
json_lda <- topicmodels_json_ldavis(fitted = text_lda, 
                                    doc_dtm = text_dtm, 
                                    method = "TSNE")
json_lda %>% serVis()
```

```{r}
# Save
text_tidy %>% saveRDS(paste0('../temp/text_tidy_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
text_lda %>% saveRDS(paste0('../temp/text_lda_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
json_lda %>% serVis(out.dir = paste0('output/LDAviz_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

```{r}
# clean up
rm(text_tidy, text_dtm, text_lda, json_lda)
```

# Local citations

```{r}
CR <- M %>% citations(sep = ";")
```

```{r}
CR %>% saveRDS(paste0('../temp/CR_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```



```{r}
#CRL <- M %>% localCitations(sep = ";") # For some reason takes forever...
#CRL %>% saveRDS(paste0('../temp/CRL_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

```{r}
rm(CR) # CRL is only created in the disabled chunk above
```


# Historical citation

```{r}
# Create a historical citation network
histResults <- M %>% histNetwork(sep = ";")
```

```{r}
histResults %>% histPlot(n = 50, size = 10, labelsize = 5)
```

```{r}
histResults %>% saveRDS(paste0('../temp/histResult_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
```

```{r}
rm(histResults)
```

# Other stuff

## Threefield Plot

```{r}
M_threefield <- M %>% as.data.frame() %>% threeFieldsPlot(fields = c("AU", "DE", "CR_SO"), n = c(20, 20, 10))
M_threefield
```

```{r}
M_threefield %>% saveRDS(paste0('../temp/threefield_', str_to_lower(var_inst), '_', str_to_lower(var_dept), '.rds'))
rm(M_threefield)

# Further analyses, currently disabled
# M %>% authorProdOverTime(k = 10, graph = TRUE)
# M %>% rpys(sep = ";", graph = TRUE)
# M %>% thematicMap()
# M_them_evo <- M %>% thematicEvolution(years = c(2016, 2018, 2021))

############################################################################
# Conceptual Structure
############################################################################
# CS <- M %>% conceptualStructure(field = "ID", method = "CA", minDegree = 4, clust = 5, stemming = FALSE, labelsize = 10, documents = 10)
# CS %>% saveRDS("../temp/CS.RDS")
# rm(CS)

############################################################################
# Other network levels
############################################################################
# mat_bib %<>% normalizeSimilarity(type = "association") # NOTE: We do not normalize on the biblio-network publication level anymore.
```

